47491968 Yuankai Liu
47511663 Jianing Zhao
47564911 Yang Peng
This dataset consists of images of handwritten digits from 0 to 9. We used about 600 images per digit, more than 6,000 images in total. Each image is 28 * 28 pixels.
We selected this dataset because everyone has their own handwriting habits, so recognizing handwritten digits is valuable work both in academia and in business. In this lab we make predictions to recognize the digits in these images. We think some third parties would be interested in these results because automatic recognition can take simple but tedious work off employees' hands.
In this lab we need the accuracy to be high enough, otherwise the work is meaningless, so we must find a proper number of features for our algorithm.
In this part, the original dataset has 60,000 images in total, and we randomly sampled about ten percent of them for this lab.
import matplotlib.pyplot as plt
import skimage.io as io
import random

data = []
y = []
limit = 10  # keep roughly 10% of each digit's images

# Load each digit's folder and randomly sample about ten percent of its images.
for digit in range(10):
    coll = io.ImageCollection('digital/%d/*.png' % digit)
    for i in range(len(coll)):
        if random.randint(0, 100) <= limit:
            data.append(coll[i])
            y.append(digit)

print(len(data))
print(len(y))
Linearize the images to create a table of 1-D image features. The original image size is 28 * 28, so each flattened image has 784 features.
import numpy as np
a = np.array(data)
c = []
for i in range(len(data)):
    c.append(a[i].reshape(-1))
d = np.array(c)
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in newer versions
X=d
y=np.array(y)
n_samples, n_features = X.shape
h,w=a[0].shape
n_classes=10
img=a
names=['0','1','2','3','4','5','6','7','8','9']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
print("n_samples: {}".format(n_samples))
print("n_features: {}".format(n_features))
print("n_classes: {}".format(n_classes))
print("Original Image Sizes {} by {}".format(h,w))
print (28*28) # the size of the images are the size of the feature vectors
Below are several randomly selected images.
import random
def plot_gallery(images, titles, h, w, n_row=3, n_col=6):
    """Helper function to plot a gallery of digit images"""
    a = 0
    b = len(X_train) - 1  # randint is inclusive on both ends
    plt.figure(figsize=(1.7 * n_col, 2.3 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=.99, top=.90, hspace=.35)
    for i in range(n_row * n_col):
        seq = random.randint(a, b)
        plt.subplot(n_row, n_col, i + 1)
        plt.imshow(images[seq].reshape((h, w)), cmap=plt.cm.gray)
        plt.title(titles[seq], size=12)
        plt.xticks(())
        plt.yticks(())
plot_gallery(X_train, y_train, h, w)
Perform linear dimensionality reduction of the images using principal components analysis.
from sklearn.decomposition import PCA
n_components = 300
print ("Extracting the top %d eigennumbers from %d numbers" % (
n_components, X_train.shape[0]))
pca = PCA(n_components=n_components)
%time pca.fit(X_train.copy())
eigenfaces = pca.components_.reshape((n_components, h, w))
def plot_explained_variance(pca):
    import plotly
    from plotly.graph_objs import Scatter, Marker, Layout, XAxis, YAxis, Bar, Line
    plotly.offline.init_notebook_mode()  # run at the start of every notebook
    explained_var = pca.explained_variance_ratio_
    cum_var_exp = np.cumsum(explained_var)
    plotly.offline.iplot({
        "data": [Bar(y=explained_var, name='individual explained variance'),
                 Scatter(y=cum_var_exp, name='cumulative explained variance')],
        "layout": Layout(xaxis=XAxis(title='Principal components'),
                         yaxis=YAxis(title='Explained variance ratio'))
    })
import plotly
plotly.offline.init_notebook_mode()
Visualize the explained variance of each component.
plot_explained_variance(pca)
From this plot we can see that around 150 components already represent the image data adequately. In this lab we chose 300 components, and the results are shown below.
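The component count can also be picked programmatically from the cumulative explained variance rather than read off the plot. A minimal sketch, using sklearn's small built-in 8x8 digits set as stand-in data (our notebook's `X_train` is not reproduced here; `X_demo`, `n_needed`, and the 95% threshold are illustrative choices):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Stand-in data: sklearn's 8x8 digits (64 features per image).
X_demo = load_digits().data

# Fit a full PCA and accumulate the explained-variance ratios.
pca_full = PCA().fit(X_demo)
cum_var = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components whose cumulative ratio reaches 95%.
threshold = 0.95
n_needed = int(np.searchsorted(cum_var, threshold) + 1)
print(n_needed)
```

The same loop over `cum_var` would apply to our 784-feature images once the PCA above is fitted.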
def plot_gallery2(images, titles, h, w, n_row=3, n_col=6):
    """Helper function to plot a gallery of digit images"""
    plt.figure(figsize=(1.7 * n_col, 2.3 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=.99, top=.90, hspace=.35)
    for i in range(n_row * n_col):
        plt.subplot(n_row, n_col, i + 1)
        plt.imshow(images[i].reshape((h, w)), cmap=plt.cm.gray)
        plt.title(titles[i], size=12)
        plt.xticks(())
        plt.yticks(())
eigenface_titles = ["eigennumber %d" % i for i in range(eigenfaces.shape[0])]
plot_gallery2(eigenfaces, eigenface_titles, h, w)
Above are some "base" images: they are built from the top, most useful eigennumbers (principal components).
n_components = 300
print ("Extracting the top %d eigennumbers from %d numbers" % (
n_components, X_train.shape[0]))
rpca = PCA(n_components=n_components,svd_solver='randomized')
%time rpca.fit(X_train.copy())
eigenfaces = rpca.components_.reshape((n_components, h, w))
eigenface_titles = ["eigennumber %d" % i for i in range(eigenfaces.shape[0])]
plot_gallery2(eigenfaces, eigenface_titles, h, w)
Above are also "base" images, this time computed with the randomized SVD solver (a faster approximation); they are likewise used to reconstruct images.
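Beyond visual inspection, the full and randomized solvers can be compared quantitatively by their mean squared reconstruction error. A hedged sketch on stand-in data (sklearn's 8x8 digits; `n_components=20` and the variable names are illustrative, not this lab's settings):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X_demo = load_digits().data  # stand-in for our 28x28 data

# Mean squared reconstruction error for the exact and the randomized solver.
errors = {}
for solver in ('full', 'randomized'):
    p = PCA(n_components=20, svd_solver=solver, random_state=0)
    X_rec = p.inverse_transform(p.fit_transform(X_demo))
    errors[solver] = float(np.mean((X_demo - X_rec) ** 2))
print(errors)  # the randomized solver should land very close to the exact one
```

This is the same comparison we make visually with the reconstructed digits below.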
Perform non-linear dimensionality reduction of the image data.
%%time
from sklearn.decomposition import KernelPCA
n_components = 400
# print(np.sum(~np.isfinite(X)))
print("Extracting the top %d eigennumbers from %d digits" % (n_components, X_train.shape[0]))
kpca = KernelPCA(n_components=n_components, kernel='rbf',
                 fit_inverse_transform=True, gamma=15,  # very sensitive to the gamma parameter
                 remove_zero_eig=True)
kpca.fit(X_train.copy())
import warnings
warnings.simplefilter('ignore', DeprecationWarning)
from ipywidgets import widgets # make this interactive!
# compare the different methods
def plt_reconstruct(idx_to_reconstruct):
    idx_to_reconstruct = int(np.round(idx_to_reconstruct))  # make sure the index is an integer
    reconstructed_image = pca.inverse_transform(pca.transform(X_train[idx_to_reconstruct].reshape(1, -1)))
    reconstructed_image_rpca = rpca.inverse_transform(rpca.transform(X_train[idx_to_reconstruct].reshape(1, -1)))
    reconstructed_image_kpca = kpca.inverse_transform(kpca.transform(X_train[idx_to_reconstruct].reshape(1, -1)))
    plt.figure(figsize=(15, 7))
    plt.subplot(1, 4, 1)
    plt.imshow(X_train[idx_to_reconstruct].reshape((h, w)), cmap=plt.cm.gray)
    plt.title(y_train[idx_to_reconstruct])
    plt.grid()
    plt.subplot(1, 4, 2)
    plt.imshow(reconstructed_image.reshape((h, w)), cmap=plt.cm.gray)
    plt.title('Full PCA')
    plt.grid()
    plt.subplot(1, 4, 3)
    plt.imshow(reconstructed_image_rpca.reshape((h, w)), cmap=plt.cm.gray)
    plt.title('Randomized PCA')
    plt.grid()
    plt.subplot(1, 4, 4)
    plt.imshow(reconstructed_image_kpca.reshape((h, w)), cmap=plt.cm.gray)
    plt.title('Kernel PCA')
    plt.grid()

# slider must stay within X_train, which holds 75% of the samples
widgets.interact(plt_reconstruct, idx_to_reconstruct=(0, len(X_train) - 1, 1), __manual=True)
Above are the images reconstructed by the different methods.
In this lab, intuitively, we prefer full PCA because it runs much faster than kernel PCA and the reconstructed image is quite good.
Below is the quantitative comparison. We build a nearest-neighbour classifier to see the actual classification performance: for every image, find its 10 nearest images, count the occurrences of each label among them, and compare the most frequent label with the image's own label.
import copy
from sklearn.metrics.pairwise import pairwise_distances
%time X_train_pca = pca.transform(X_train)
%time X_train_rpca = rpca.transform(X_train)
%time X_train_kpca = kpca.transform(X_train)
%time dist_matrix_pca = pairwise_distances(X_train_pca)
%time dist_matrix_rpca = pairwise_distances(X_train_rpca)
%time dist_matrix_kpca = pairwise_distances(X_train_kpca)
def knn_accuracy(dist_matrix):
    """10-nearest-neighbour majority vote over a precomputed distance matrix."""
    y_pre = []
    for j in range(len(X_train)):
        pre = [0] * 10
        distance = copy.deepcopy(dist_matrix[j, :])
        distance[j] = np.inf  # do not let an image vote for itself
        for i in range(10):
            idx = np.argmin(distance)
            distance[idx] = np.inf  # exclude an already-counted neighbour
            pre[y_train[idx]] += 1
        y_pre.append(pre.index(max(pre)))
    count = 0
    for i in range(len(X_train)):
        if y_train[i] == y_pre[i]:
            count += 1
    return count / len(X_train)

print(knn_accuracy(dist_matrix_pca))
print(knn_accuracy(dist_matrix_rpca))
print(knn_accuracy(dist_matrix_kpca))
These classification results also support our preference for full PCA.
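For reference, the manual 10-nearest-neighbour majority vote above matches what sklearn's `KNeighborsClassifier` does out of the box. A sketch on stand-in data (the 8x8 digits set; the 30-component PCA and `random_state` are illustrative choices, not this lab's settings):

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X_demo, y_demo = load_digits(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.25, random_state=0)

# Project to a lower-dimensional PCA space, then classify by 10 nearest neighbours.
pca_demo = PCA(n_components=30).fit(Xtr)
knn = KNeighborsClassifier(n_neighbors=10)
knn.fit(pca_demo.transform(Xtr), ytr)
acc = knn.score(pca_demo.transform(Xte), yte)
print(acc)
```

Unlike our loop, this scores on a held-out test set rather than the training images themselves, which is the fairer evaluation.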
Perform feature extraction upon the images using DAISY feature extraction technique.
from skimage.io import imshow
idx_to_reconstruct = int(np.random.rand(1) * len(X_train))  # index into X_train, not the full X
img1 = X_train[idx_to_reconstruct].reshape((h,w))
imshow(img1)
plt.grid()
from skimage.feature import daisy
# let's first visualize what the DAISY descriptor looks like
features, img_desc = daisy(img1,step=40, radius=10, rings=3, histograms=5, orientations=8, visualize=True)
imshow(img_desc)
plt.grid()
features = daisy(img1,step=10, radius=10, rings=2, histograms=4, orientations=8, visualize=False)
print(features.shape)
print(features.shape[0]*features.shape[1]*features.shape[2])
def apply_daisy(row, shape):
    feat = daisy(row.reshape(shape), step=10, radius=10, rings=2, histograms=6, orientations=8, visualize=False)
    return feat.reshape((-1))
%time test_feature = apply_daisy(X_train[3],(h,w))
test_feature.shape
%time daisy_features = np.apply_along_axis(apply_daisy, 1, X_train, (h,w))
print(daisy_features.shape)
from sklearn.metrics.pairwise import pairwise_distances
%time dist_matrix = pairwise_distances(daisy_features)
Compute the pairwise distances between all the image features, then find the smallest distance to locate the closest image to the original one.
import copy
# find closest image to current image
idx1 = 5
distances = copy.deepcopy(dist_matrix[idx1,:])
distances[idx1] = np.inf
idx2 = np.argmin(distances)
plt.figure(figsize=(7,10))
plt.subplot(1,2,1)
imshow(X_train[idx1].reshape((h,w)))
plt.title("Original Image")
plt.grid()
plt.subplot(1,2,2)
imshow(X_train[idx2].reshape((h,w)))
plt.title("Closest Image")
plt.grid()
from ipywidgets import fixed
# put it together inside a nice widget
def closest_image(dmat, idx1):
    distances = copy.deepcopy(dmat[idx1, :])  # get all image distances
    distances[idx1] = np.inf  # don't pick the same image!
    idx2 = np.argmin(distances)
    distances[idx2] = np.inf
    idx3 = np.argmin(distances)
    plt.figure(figsize=(10, 16))
    plt.subplot(1, 3, 1)
    imshow(X_train[idx1].reshape((h, w)))
    plt.title("Original Image " + names[y_train[idx1]])
    plt.grid()
    plt.subplot(1, 3, 2)
    imshow(X_train[idx2].reshape((h, w)))
    plt.title("Closest Image " + names[y_train[idx2]])
    plt.grid()
    plt.subplot(1, 3, 3)
    imshow(X_train[idx3].reshape((h, w)))
    plt.title("Next Closest Image " + names[y_train[idx3]])
    plt.grid()
widgets.interact(closest_image,idx1=(0,len(X_train)-1,1),dmat=fixed(dist_matrix),__manual=True)
Build a nearest-neighbour classifier to see the actual classification performance: for every image, find its 10 nearest images, count the occurrences of each label among them, and compare the most frequent label with the image's own label.
y_pre = []
for j in range(len(X_train)):
    pre = [0] * 10
    distance = copy.deepcopy(dist_matrix[j, :])
    distance[j] = np.inf  # do not let an image vote for itself
    for i in range(10):
        idx = np.argmin(distance)
        distance[idx] = np.inf  # exclude an already-counted neighbour
        pre[y_train[idx]] += 1
    y_pre.append(pre.index(max(pre)))
count = 0
for i in range(len(X_train)):
    if y_train[i] == y_pre[i]:
        count += 1
print(count / len(X_train))
In this lab, the accuracy is 95.58%, so this classifier performs quite well.
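Accuracy alone hides which digits get confused with which; a confusion matrix makes that visible. A sketch on stand-in data (the 8x8 digits set with k=10, not this lab's DAISY features):

```python
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X_demo, y_demo = load_digits(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.25, random_state=0)

knn = KNeighborsClassifier(n_neighbors=10).fit(Xtr, ytr)
cm = confusion_matrix(yte, knn.predict(Xte))
print(cm.shape)  # 10 x 10: rows are true digits, columns are predicted digits
```

Off-diagonal entries count the misclassifications, e.g. how often a 4 is predicted as a 9.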
from skimage.filters import gabor_kernel
from scipy import ndimage as ndi
from scipy import stats
# prepare filter bank kernels
kernels = []
for theta in range(4):
    theta = theta / 4. * np.pi
    for sigma in (1, 3):
        for frequency in (0.05, 0.25):
            kernel = np.real(gabor_kernel(frequency, theta=theta,
                                          sigma_x=sigma, sigma_y=sigma))
            kernels.append(kernel)

# apply the filter bank and take statistics of each filtered image
def compute_gabor(row, kernels, shape):
    feats = np.zeros((len(kernels), 4), dtype=np.double)
    for k, kernel in enumerate(kernels):
        filtered = ndi.convolve(row.reshape(shape), kernel, mode='wrap')
        # stats.describe returns (nobs, minmax, mean, variance, skewness, kurtosis)
        _, _, feats[k, 0], feats[k, 1], feats[k, 2], feats[k, 3] = stats.describe(filtered.reshape(-1))
    return feats.reshape(-1)
idx_to_reconstruct = int(np.random.rand(1)*len(X_train))
gabr_feature = compute_gabor(X_train[idx_to_reconstruct], kernels, (h,w))
gabr_feature
# takes ~3 minutes to run entire dataset
%time gabor_stats = np.apply_along_axis(compute_gabor, 1, X_train, kernels, (h,w))
print(gabor_stats.shape)
from sklearn.metrics.pairwise import pairwise_distances
# find the pairwise distance between all the different image features
%time dist_matrix_gabor = pairwise_distances(gabor_stats)
widgets.interact(closest_image, idx1=(0, len(X_train) - 1, 1), dmat=fixed(dist_matrix_gabor), __manual=True)
print(dist_matrix_gabor.shape)
y_pre = []
for j in range(len(X_train)):
    pre = [0] * 10
    distance = copy.deepcopy(dist_matrix_gabor[j, :])
    distance[j] = np.inf  # do not let an image vote for itself
    for i in range(10):
        idx = np.argmin(distance)
        distance[idx] = np.inf  # exclude an already-counted neighbour
        pre[y_train[idx]] += 1
    y_pre.append(pre.index(max(pre)))
count = 0
for i in range(len(X_train)):
    if y_train[i] == y_pre[i]:
        count += 1
print(count / len(X_train))
In this part we used Gabor kernels for feature extraction; the accuracy is 72.64%, worse than the DAISY result in 3.5. We think this may be due to the type of images: different kinds of images have different pixel distributions, and the pixels of our digit images are quite distinctive, so we should master several methods and apply each where it fits.